Omnifluent English-to-French and Russian-to-English Systems for the 2013 Workshop on Statistical Machine Translation

نویسندگان

  • Evgeny Matusov
  • Gregor Leusch
چکیده

This paper describes OmnifluentTM Translate – a state-of-the-art hybrid MT system capable of high-quality, high-speed translations of text and speech. The system participated in the English-to-French and Russian-to-English WMT evaluation tasks with competitive results. The features which contributed the most to high translation quality were training data sub-sampling methods, document-specific models, as well as rule-based morphological normalization for Russian. The latter improved the baseline Russian-to-English BLEU score from 30.1 to 31.3% on a heldout test set.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

OmnifluentTM English-to-French and Russian-to-English Systems for the 2013 Workshop on Statistical Machine Translation

This paper describes OmnifluentTM Translate – a state-of-the-art hybrid MT system capable of high-quality, high-speed translations of text and speech. The system participated in the English-to-French and Russian-to-English WMT evaluation tasks with competitive results. The features which contributed the most to high translation quality were training data sub-sampling methods, document-specific ...

متن کامل

Munich-Edinburgh-Stuttgart Submissions of OSM Systems at WMT13

This paper describes Munich-EdinburghStuttgart’s submissions to the Eighth Workshop on Statistical Machine Translation. We report results of the translation tasks from German, Spanish, Czech and Russian into English and from English to German, Spanish, Czech, French and Russian. The systems described in this paper use OSM (Operation Sequence Model). We explain different pre-/post-processing ste...

متن کامل

Yandex School of Data Analysis Machine Translation Systems for WMT13

This paper describes the English-Russian and Russian-English statistical machine translation (SMT) systems developed at Yandex School of Data Analysis for the shared translation task of the ACL 2013 Eighth Workshop on Statistical Machine Translation. We adopted phrase-based SMT approach and evaluated a number of different techniques, including data filtering, spelling correction, alignment of l...

متن کامل

QCRI-MES Submission at WMT13: Using Transliteration Mining to Improve Statistical Machine Translation

This paper describes QCRI-MES’s submission on the English-Russian dataset to the Eighth Workshop on Statistical Machine Translation. We generate improved word alignment of the training data by incorporating an unsupervised transliteration mining module to GIZA++ and build a phrase-based machine translation system. For tuning, we use a variation of PRO which provides better weights by optimizing...

متن کامل

The Heidelberg University Machine Translation Systems for IWSLT 2013

We present our systems for the machine translation evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2013. We submitted systems for three language directions: German-to-English, Russianto-English and English-to-Russian. The focus of our approaches lies on effective usage of the in-domain parallel training data. Therefore, we use the training data to tune p...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013